Aggregation in Probabilistic Databases via Knowledge Compilation
Authors
Abstract
This paper presents a query evaluation technique for positive relational algebra queries with aggregates on a representation system for probabilistic data based on the algebraic structures of semiring and semimodule. The core of our evaluation technique is a procedure that compiles semimodule and semiring expressions into so-called decomposition trees, for which the computation of the probability distribution can be done in time linear in the product of the sizes of the probability distributions represented by its nodes. We give syntactic characterisations of tractable queries with aggregates by exploiting the connection between query tractability and polynomial-time decomposition trees. A prototype of the technique is incorporated in the probabilistic database engine SPROUT. We report on performance experiments with custom datasets and TPC-H data.
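As a rough illustration of why the cost at a node depends on the product of the sizes of the distributions below it, the following sketch computes the distribution of a SUM aggregate over uncertain tuples by pairwise convolution. It is not the paper's compilation procedure or SPROUT code: it assumes every tuple is an independent event, and the names `convolve`, `tuple_dist`, and `sum_dist` are hypothetical helpers introduced only for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# A finite distribution: aggregate value -> probability.
Dist = Dict[int, float]


def convolve(left: Dist, right: Dist, op: Callable[[int, int], int]) -> Dist:
    """Combine two independent distributions under a binary operation.

    The nested loop visits len(left) * len(right) pairs, i.e. the work is
    linear in the product of the two distribution sizes.
    """
    out: Dict[int, float] = defaultdict(float)
    for a, pa in left.items():
        for b, pb in right.items():
            out[op(a, b)] += pa * pb
    return dict(out)


def tuple_dist(value: int, prob: float, neutral: int = 0) -> Dist:
    """Contribution of one uncertain tuple (assumed independent):
    `value` if the tuple is present, the neutral element otherwise."""
    if value == neutral:
        return {neutral: 1.0}
    return {value: prob, neutral: 1.0 - prob}


def sum_dist(tuples: Iterable[Tuple[int, float]]) -> Dist:
    """Distribution of SUM over (value, probability) pairs."""
    dist: Dist = {0: 1.0}  # empty sum is 0 with certainty
    for value, prob in tuples:
        dist = convolve(dist, tuple_dist(value, prob), lambda a, b: a + b)
    return dist


if __name__ == "__main__":
    # Two tuples with values 10 and 20, each present with probability 0.5.
    print(sum_dist([(10, 0.5), (20, 0.5)]))
```

Swapping the operation passed to `convolve` (for example `min` or `max`, with a matching neutral element) gives the same pattern for other aggregates. Correlated tuples would need the kind of decomposition the abstract describes, which this sketch does not attempt.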
Similar resources
10 Years of Probabilistic Querying - What Next?
Over the past decade, the two research areas of probabilistic databases and probabilistic programming have intensively studied the problem of making structured probabilistic inference scalable, but—so far—both areas developed almost independently of one another. While probabilistic databases have focused on describing tractable query classes based on the structure of query plans and data lineag...
New Limits for Knowledge Compilation and Applications to Exact Model Counting
We show new limits on the efficiency of using current techniques to make exact probabilistic inference for large classes of natural problems. In particular we show new lower bounds on knowledge compilation to SDD and DNNF forms. We give strong lower bounds on the complexity of SDD representations by relating SDD size to best-partition communication complexity. We use this relationship to prove ...
Scaling Lifted Probabilistic Inference and Learning Via Graph Databases
Over the past decade, exploiting relations and symmetries within probabilistic models has proven to be surprisingly effective at solving large-scale data mining problems. One of the key operations inside these lifted approaches is counting, be it for parameter/structure learning or for efficient inference. Typically, however, they just count exploiting the logical structure using ad hoc oper...
Conditioning in First-Order Knowledge Compilation and Lifted Probabilistic Inference
Knowledge compilation is a powerful technique for compactly representing and efficiently reasoning about logical knowledge bases. It has been successfully applied to numerous problems in artificial intelligence, such as probabilistic inference and conformant planning. Conditioning, which updates a knowledge base with observed truth values for some propositions, is one of the fundamental operati...
Compiling Probabilistic Logic Programs into Sentential Decision Diagrams
Knowledge compilation algorithms transform a probabilistic logic program into a circuit representation that permits efficient probability computation. Knowledge compilation underlies algorithms for exact probabilistic inference and parameter learning in several languages, including ProbLog, PRISM, and LPADs. Developing such algorithms involves a choice of which circuit language to target, and ...
Journal title:
- PVLDB
Volume 5, Issue
Pages -
Publication date: 2012